Xiangjun Fan

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

May 31, 2026

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

May 31, 2026

Spreadsheet-RL: Advancing Large Language Model Agents on Realistic Spreadsheet Tasks via Reinforcement Learning

May 21, 2026

Agentic Recommender System with Hierarchical Belief-State Memory

May 14, 2026

Synthetic Sandbox for Training Machine Learning Engineering Agents

Apr 06, 2026

GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification

Mar 31, 2026

LLM-Driven Reasoning for Constraint-Aware Feature Selection in Industrial Systems

Mar 26, 2026

TARo: Token-level Adaptive Routing for LLM Test-time Alignment

Mar 19, 2026

ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning

Mar 10, 2026

DREAM: Where Visual Understanding Meets Text-to-Image Generation

Mar 03, 2026